160 research outputs found

    Digitizing Intangible Cultural Heritage

    As part of the UNESCO project "Establishment of a National Inventory and Electronic Database of Lithuanian Intangible Cultural Heritage", the authors, representing the EU-funded project "European Cultural Heritage Online" (ECHO), were invited to give a course in digital archiving called "Digitizing Intangible Cultural Heritage" in Vilnius, Lithuania, from March 15 to 20, 2004. The present report briefly summarizes the sessions given. The analyses of the state of the digitization work at the participating institutes, and recommendations for the future, then follow in a dedicated, stand-alone section.

    Psycholinguistik

    5.1 Introduction to the research field. Psycholinguistics is the branch of linguistics concerned with the relationship between human language and thought and other mental processes; that is, it addresses a series of essential questions, such as: (1) How does our brain manage to understand essentially acoustic and visual communicative information and convert it into mental representations? (2) How can our brain convert a complex state of affairs that we want to convey to others into a sequence of verbal and nonverbal actions that others can process? (3) How do we manage to learn languages in the various phases of life? (4) Are the cognitive processes of language processing universal, even though language systems differ so much that hardly any universals can be found in their structures?

    WP2 Report from the ECHO IT Days

    Abstract not available

    Towards a European Collaborative Data Infrastructure

    The EUDAT project is a pan-European data initiative that started in October 2011. The project brings together a unique consortium of 25 partners - including research communities, national data and high performance computing (HPC) centres, technology providers, and funding agencies - from 13 countries. EUDAT aims to build a sustainable cross-disciplinary and cross-national data infrastructure that provides a set of shared services for accessing and preserving research data. The design and deployment of these services is being coordinated by multi-disciplinary task forces comprising representatives from research communities and data centres. This short paper presents the achievements of the project during its first year and describes the services that have been chosen to meet the requirements of the initial research communities involved in the project. Affiliations: CSC - IT Center for Science Ltd., FI-02101 Espoo, Finland; SARA, Science Park 140, 1098 XG Amsterdam, The Netherlands; Max Planck Institute for Psycholinguistics, PO Box 310, 6500 AH Nijmegen, The Netherlands.

    Evaluation of Application Possibilities for Packaging Technologies in Canonical Workflows

    In the Canonical Workflow Framework for Research (CWFR), "packages" are relevant in two different respects. In data science, workflows are generally executed on a set of files that have been aggregated for a specific purpose, such as training a model in deep learning. We call this type of "package" a data collection; its aggregation and metadata description are motivated by research interests. The other type of "package" relevant for CWFR is meant to represent workflows in a self-describing and self-contained way for later execution. In this paper, we review different packaging technologies and investigate their usability in the context of CWFR. For this purpose, we draw on an exemplary use case and show how packaging technologies can support its realization. We conclude that packaging technologies of different flavors help to provide inputs and outputs for workflow steps in a machine-readable way, and to represent a workflow and all its artifacts in a self-describing and self-contained way.

    Foundations of Modern Language Resource Archives

    A number of serious reasons will convince an increasing number of researchers to store their relevant material in centers which we will call "language resource archives". These archives combine the duty of long-term preservation with the task of giving different user groups access to their material. Access here means that active interaction with the data is made possible, to support the integration of new data, new versions, or commentaries of all sorts. Modern Language Resource Archives will have to adhere to a number of basic principles to fulfill all requirements, and they will have to be involved in federations to create joint language resource domains, making it even simpler for researchers to access the data. This paper attempts to formulate the essential pillars that language resource archives have to adhere to.

    Canonical Workflow Framework for Research (CWFR) - position paper - version 2 December 2020

    With this paper we want to describe the motivation and basic ideas behind CWFR. Two working meetings were held to discuss the CWFR concept and to relate it to other work on "workflows" that has already been done. We intend to develop this paper further as insights grow from the discussions and interactions we are planning to organise.

    Connecting Repositories to one Integrated Domain

    Information is the new commodity in the global economy and trustworthy digital repositories will be the key pillars within this new ecosystem. The value of this digital information will only be realised if these repositories can be interacted with in a consistent manner and their data is accessible and understandable globally. Establishing a data interoperability layer is the goal of the emerging domain of Digital Objects. When considering how to design this interoperability layer, it is important to state that repositories need to be considered from two different perspectives: (1) repositories are a reflection of the institutions that make them operational (quality of service, skilled experts, accessible over many years, appropriate data management procedures); (2) repositories are computational services that provide a specific set of functions. Complicating the effort to make repositories accessible and interoperable across the globe is that many existing repositories have been developed in the past decades using a wide range of heterogeneous technologies, organisation of data and functionality. Many of these repositories are their own data silos and not interoperable. It is important to realise that much money has been invested to build these repositories and therefore we cannot expect that they will make large changes without great incentives and funding. This heterogeneity is the core of the challenge in making digital information the new commodity in the emerging global domain of digital objects. This paper will focus on the functional aspects of repositories and proposes the FAIR Digital Object (FDO) model as a core data model for describing digital information and the use of the Digital Object Interface Protocol (DOIP) to establish interoperable communication with all repositories independently of the respective technical choices.
It is the conviction of this paper's authors that this integration of the FDO model and DOIP with existing repositories can be performed with minimal effort, and we will present examples that document this claim. We will present three examples of integration in this paper: (1) an integration of B2SHARE, (2) a CORDRA repository, and (3) an integration of the DOBES archive. B2SHARE is a repository that has assigned Persistent Identifiers (PIDs), in the form of Handles, to all of its digital files. It allows users to add metadata according to a unified schema, and also lets user communities extend this schema. The API allows one to specify a Handle, which then gives access to the metadata and/or the bit sequences of the digital object (DO). It should be noted that B2SHARE allows a set of bit-sequences to be linked with the Handle. The integration consists of building a proxy that provides a DOIP interface to B2SHARE to streamline the integration of the data and metadata into a single DO. The development of the proxy was relatively simple and did not require any changes on the part of the B2SHARE repository. CORDRA is a CNRI repository/registry/registration system that manages DOs, assigns Handles to all its DOs and is accessible through DOIP. For all intents and purposes, it implements many of the features of the Digital Object Architecture. The integration of the two repositories enables copying files or moving digital objects. In the case of copying files (metadata and bit sequences) from B2SHARE to CORDRA, for example, all functionality of the CORDRA service, such as searching, would become available. Importantly, in this case the PID record identifying the digital object in the B2SHARE repository would have to be extended to point to the alternative path, and the API of B2SHARE would have to offer the alternative access paths to a client. This latter aspect has not been implemented.
Moving a DO from B2SHARE to CORDRA would result in changing the ownership of the PID and adding the updated information about the DO. The adaptation of the DOBES archive has not been done yet, but since this archive has some special functionalities, it is interesting to discuss how the adaptation could be approached. In the DOBES archive each bundle of closely related digital objects is assigned a Handle, and metadata is also treated as a digital object, i.e., it has a separate Handle. For management reasons, and especially to enable different contributors to maintain control of access rights, a tree structure was developed that allows contributors to organise their data according to specific criteria and users to browse the archive in addition to executing searches on the metadata. While accessing archival objects is comparatively simple, the ingest/upload feature is more complex. It should be noted that the archive supports establishing a canonical tree of resources to define scopes for authorisation (defining who has the right to grant access permissions, etc.) and to facilitate lookup by supporting browsing according to understandable criteria. Therefore, depositors need to specify where in the tree the new resources should be integrated, and which initial rights are associated with them. After the gathered information has been uploaded into a workspace, the archive carries out many checks in a micro-workflow: metadata is checked against vocabularies and partly curated, types of bit-sequences are checked and aligned with the information in the metadata, etc. An operation called the gatekeeper has been developed to ensure a highly consistent archive despite the many (remote) people contributing to its content.
Thus, the archive requires a set of four information units to be specified: (1) the set of bit-sequences to be uploaded, (2) the metadata describing the bundle, (3) the node to be used to organise the resources, and (4) the initial rights, where the default would be "open". Adapting this archive to DOIP would imply that the proxy provides a set of operations such as "ingest a complex object", "update metadata", "add another bit-sequence to a specific object", "get me the list of operations", "give me the metadata", etc. A client must be developed to handle the front-end interaction with a user, allowing them to specify the required information and to choose a suitable operation. The client would then have to interact with the repository via DOIP by starting, for example, the gatekeeper as an external operation.
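The four information units and the gatekeeper-style checks described in this abstract can be pictured with a small sketch. This is only an illustration under stated assumptions: the names `IngestRequest` and `gatekeeper_check`, the `format` metadata key, and the extension-based type check are hypothetical and are not taken from the DOBES archive, B2SHARE, CORDRA, or the DOIP specification.

```python
from dataclasses import dataclass


@dataclass
class IngestRequest:
    """Hypothetical container for the four information units named in the abstract."""
    bit_sequences: dict[str, bytes]   # file name -> content to be uploaded
    metadata: dict[str, str]          # metadata describing the bundle
    tree_node: str                    # node of the archive tree that organises the resources
    initial_rights: str = "open"      # the abstract names "open" as the default


def gatekeeper_check(request: IngestRequest, allowed_formats: set[str]) -> list[str]:
    """Simplified stand-in for the archive's checks; returns a list of problems found."""
    problems = []
    if not request.bit_sequences:
        problems.append("no bit-sequences supplied")
    if not request.tree_node:
        problems.append("no tree node specified")
    # Assumed check: the declared format must come from a controlled vocabulary.
    declared = request.metadata.get("format")
    if declared not in allowed_formats:
        problems.append(f"format {declared!r} is not in the vocabulary")
        return problems
    # Assumed check: each file's type (here, its extension) must match the metadata.
    for name in request.bit_sequences:
        extension = name.rsplit(".", 1)[-1].lower() if "." in name else ""
        if extension != declared:
            problems.append(f"{name}: type does not match declared format {declared!r}")
    return problems
```

A proxy of the kind the abstract envisages could run such a check before accepting an "ingest a complex object" operation and reject the request if the returned list is not empty.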

    Foundation of a Component-based Flexible Registry for Language Resources and Technology

    Within the CLARIN e-science infrastructure project, it is foreseen to develop a component-based registry for metadata for Language Resources and Language Technology. With this registry it is hoped to overcome the problems of the currently available systems, namely inflexible fixed schemas, unsuitable terminology, and limited interoperability. The registry will address interoperability needs by referring to a shared vocabulary registered in data category registries, as suggested by ISO.